Overview

OpenVINO™ toolkit: An open source AI toolkit that makes it easier to write once, deploy anywhere.


OpenVINO™ Toolkit for AI PC

We're entering an era where AI-focused hardware and software advances make AI PC a reality. Intel provides highly optimized developer support for AI workloads by including the OpenVINO™ toolkit on your PC.

Seamlessly transition projects from early AI development on the PC to cloud-based training to edge deployment. More easily move AI workloads across CPU, GPU, and NPU to optimize models for efficient deployment. With OpenVINO, you can accelerate AI inference, achieve lower latency, and increase throughput while maintaining accuracy.

Hardware

Unlock AI features such as real-time language translation, automated inferencing, and enriched gaming experiences.

The Intel® Core™ Ultra processor accelerates AI on the PC by combining a CPU, GPU, and NPU through a 3D performance hybrid architecture, together with high-bandwidth memory and cache.

Intel® Core™ desktop processors optimize your gaming, content creation, and productivity.

Installation Guides

Explore additional configurations for GPUs and NPUs to get the most out of the OpenVINO toolkit.

Unlock the Power of Large Language Models (LLM)

Download this comprehensive whitepaper on LLM optimization using compression techniques. Learn to use the OpenVINO toolkit to compress LLMs, integrate them into AI applications, and deploy them on your PC with maximum performance.

Resources

Notebooks and Demos

Learn and experiment with the OpenVINO toolkit using these preconfigured Jupyter* Notebooks.

LLM Chatbot

Craft chatbots powered by an LLM using the OpenVINO toolkit.

LLM Instruction Following

Run an instruction-following text-generation pipeline.

Latent Consistency Models (LCM)

Learn about image generation using the LCM and the OpenVINO toolkit.

Distil-Whisper Model

Experience automatic speech recognition with this model and the OpenVINO toolkit.

Stable Diffusion* v2

Venture into text-to-image generation and infinite zoom capabilities with Stable Diffusion* v2 and the OpenVINO toolkit.

Bootstrapping Language-Image Pretraining (BLIP)

Use BLIP for visual language processing tasks like visual question answering and image captioning.

MusicGen

Discover a single-stage, autoregressive transformer model that produces high-quality music samples based on text descriptions or audio prompts.

YOLOv8* Optimization

Learn how to convert and optimize YOLOv8* models.

Pose Estimation

Predict the 2D position and orientation of each person in an image or a video.

OpenVINO Notebooks

Explore all available OpenVINO notebooks to unlock even more possibilities for optimized deep learning inference.

Blogs

Videos and Webinars

Embark on your AI development journey with beginner-friendly video tutorials. Gain valuable insights from experts and prepare to advance your skills.

OpenVINO Runtime Integration with Optimum*

Load optimized models from the Hugging Face Hub and create pipelines to run inference with OpenVINO Runtime without rewriting your APIs.

Learn More

AI PC Development

Discover how Intel® Core™ Ultra processors enable you to use the power of CPU, GPU, and NPU to accelerate AI development on the PC.

Get Started

The OpenVINO™ toolkit enables you to optimize a deep learning model from almost any framework and deploy it with best-in-class performance on a range of Intel® processors and other hardware platforms.

What's New in Version 2024.3

The OpenVINO™ toolkit 2024.3 release enhances generative AI (GenAI) accessibility with improved large language model (LLM) performance and expanded model coverage. It also boosts portability and performance for deployment anywhere: at the edge, in the cloud, or locally. The top features of this release are:

  • Models: Pre-optimized models are now available in Hugging Face*, making it easier for you to get started.
  • Optimizations: Significant improvement in LLM performance on discrete Intel® GPUs with the addition of multi-head attention (MHA), and enhancements from Intel® oneAPI Deep Neural Network Library (oneDNN).
  • Deployment: Improved CPU performance when serving LLMs with the inclusion of vLLM and continuous batching in model serving for the OpenVINO toolkit. vLLM is an easier-to-use open source library that supports efficient LLM inferencing and model serving.
Release Notes | View System Requirements

Easier Model Access and Conversion

Product

Details

New Model Support

Support for Phi-3-mini, a family of AI models that takes advantage of the power of small language models for faster, more accurate, and cost-effective text processing.

Llama 3 optimizations for CPUs, built-in GPUs, and discrete GPUs for improved performance and efficient memory usage.

Python*

Custom operations can now be written in Python, so Python developers no longer have to drop into C++ (which remains supported) to implement them. This lets you add your own specialized operations to any model.

Generative AI and LLM Enhancements

Expanded model support and accelerated inference.

Product

Details

New Jupyter Notebooks

An expansion to Jupyter Notebooks ensures better coverage for new models. The following noteworthy notebooks were added:

  • DynamiCrafter
  • YOLOv10
  • Chatbot notebook with Phi-3 and Qwen2

Performance Improvements for LLMs

A GPTQ method for 4-bit weight compression was added to the Neural Networks Compression Framework (NNCF) for more efficient inference and improved performance of compressed LLMs.
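The storage scheme behind group-wise 4-bit weight compression can be sketched in plain Python. This is an illustration only, not the NNCF API; the GPTQ method additionally chooses the quantized codes to minimize layer output error rather than simple rounding:

```python
# Illustrative group-wise 4-bit weight quantization in plain Python.
# Each group of weights is stored as 4-bit codes plus a scale and zero point.

GROUP = 4          # weights per quantization group (real models use 32-128)
LEVELS = 15        # 4-bit unsigned code range: 0..15

def quantize_group(weights):
    """Map a group of floats to 4-bit codes plus a scale and zero point."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / LEVELS or 1.0      # avoid zero scale for flat groups
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_group(codes, scale, zero):
    return [c * scale + zero for c in codes]

def compress(weights):
    groups = [weights[i:i + GROUP] for i in range(0, len(weights), GROUP)]
    return [quantize_group(g) for g in groups]

def decompress(packed):
    out = []
    for codes, scale, zero in packed:
        out.extend(dequantize_group(codes, scale, zero))
    return out

weights = [0.12, -0.53, 0.98, 0.07, -1.20, 0.33, 0.61, -0.05]
restored = decompress(compress(weights))
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The round-trip error is bounded by half the per-group scale, which is why smaller groups (at the cost of more scale metadata) preserve accuracy better.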

There are significant LLM performance improvements and reduced latency for built-in and discrete GPUs.

More Portability and Performance

Develop once, deploy anywhere. OpenVINO toolkit enables developers to run AI at the edge, in the cloud, or locally.

Product

Details

Model Serving Enhancements

Preview: The OpenVINO model server now supports an OpenAI*-compatible API, continuous batching, and PagedAttention, which enables significantly higher throughput for parallel inferencing, especially on Intel® Xeon® processors that serve LLMs to many concurrent users.
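One ingredient of that throughput gain, continuous batching, can be sketched in plain Python: instead of waiting for an entire batch of requests to drain, the server admits new requests as soon as a slot frees up. This is an illustration of the scheduling idea only, not the model server's implementation:

```python
# Illustrative continuous-batching scheduler. Requests join the running
# batch mid-flight as earlier requests finish, keeping slots busy.
from collections import deque

MAX_BATCH = 2

def serve(requests):
    """requests: list of (name, tokens_to_generate). Returns finish order."""
    waiting = deque(requests)
    running = {}          # name -> tokens still to generate
    finished = []
    while waiting or running:
        # Admit new requests into free slots at every decoding step.
        while waiting and len(running) < MAX_BATCH:
            name, tokens = waiting.popleft()
            running[name] = tokens
        # One decoding step: every running request emits one token.
        for name in list(running):
            running[name] -= 1
            if running[name] == 0:
                del running[name]
                finished.append(name)
    return finished

order = serve([("a", 3), ("b", 1), ("c", 2)])
# "b" finishes after one step, freeing a slot so "c" starts immediately
# instead of waiting for "a" to complete.
```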

The OpenVINO toolkit back end for the NVIDIA Triton* Inference Server now supports dynamic input shapes.

TorchServe was integrated through torch.compile on the OpenVINO toolkit back end for easier model deployment, provisioning to multiple instances, model versioning, and maintenance.

Intel Hardware Support

A significant improvement in second-token latency and memory footprint for FP16-weight LLMs on CPU platforms with Intel® Advanced Vector Extensions 2 (13th gen Intel® Core™ processors) and Intel® Advanced Vector Extensions 512 (3rd gen Intel® Xeon® Scalable processors), particularly for small batch sizes.

Preview: Support for the Intel® Xeon® 6 processor.

Generate API

Preview: Addition of Generate API, a simplified API for text generation using LLMs with only a few lines of code. The API is available through the newly launched OpenVINO Toolkit GenAI Package.

Stay Up-To-Date

Making Generative AI More Accessible for Real-World Scenarios

OpenVINO™ toolkit is an open source toolkit that accelerates AI inference with lower latency and higher throughput while maintaining accuracy, reducing model footprint, and optimizing hardware use. It streamlines AI development and integration of deep learning in domains like computer vision, large language models (LLM), and generative AI.

What's New in 2024.3

AI Programming Workshops for OpenVINO Toolkit

Learn with like-minded AI developers by joining live and on-demand webinars focused on GenAI, LLMs, AI PC, and more, including code-based workshops using Jupyter* Notebook.

How It Works

Convert and optimize models trained using popular frameworks like TensorFlow* and PyTorch*. Deploy across a mix of Intel® hardware and environments, on-premise and on-device, in the browser, or in the cloud.

Resources

Get started with OpenVINO and all the resources you need to learn, try samples, see performance, and more.

Get started

Unlock the Power of LLMs

Review optimization and deployment strategies using the OpenVINO toolkit. Plus, use compression techniques with LLMs on your PC.

Intel® Geti™ Platform

This is a commercial software platform that enables enterprise teams to develop vision AI models faster. With the platform, companies can build models with minimal data, and its OpenVINO integration makes it easier to deploy solutions at scale.

Explore the Capabilities of the Intel® Geti™ Platform

AI Inference Software & Solutions Catalog

When you are ready to go to market with your solution, explore ISV solutions that are built on OpenVINO. This ebook is designed to help you find a solution that best addresses your use-case needs, organized into sections such as banking and healthcare so you can navigate the solutions table more easily.

Explore the AI Inference Catalog

Toolkit Add-Ons

Take advantage of add-ons that extend the capabilities of the toolkit beyond what is available in the core package.

Benchmark Tool

Estimate deep learning inference performance on supported devices.

Dataset Management Framework

Use this add-on to build, transform, and analyze datasets.

Model Optimizer

This cross-platform, command-line tool facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal performance on end-point target devices.

Neural Networks Compression Framework

Use this framework based on PyTorch for quantization-aware training.

Industry Model Zoos

Hugging Face* has a repository for the OpenVINO toolkit that provides resources and models aimed at optimizing deep learning models for inference on Intel hardware.

OpenVINO Model Server

This scalable inference server is for serving models optimized with the Intel® Distribution of OpenVINO™ toolkit.

Join Us on the Journey

Subscribe below to stay up to date with the latest Intel offerings.


Resources

Community and Support

Explore ways to get involved and stay up-to-date with the latest announcements.

Get Started

Optimize, fine-tune, and run comprehensive AI inference using the included model optimizer, runtime, and development tools.

Powered by oneAPI

The productive smart path to freedom from the economic and technical burdens of proprietary alternatives for accelerated computing.